BMC Medical Informatics and Decision Making

36 training papers 2019-06-25 – 2026-03-07

Top medRxiv preprints most likely to be published in this journal, ranked by match strength.

1
Fast, efficient and accurate prediction of postoperative outcomes using a small set of intraoperative time series
2024-02-29 health informatics 10.1101/2024.02.28.24303352
#1 (17.6%)

In the period immediately following surgery, patients are at high risk of various negative outcomes such as Acute Kidney Injury (AKI) and Myocardial Infarction (MI). Identifying patients at increased risk of developing these complications assists in their prevention and management. During surgery, rich time series data of vital signs and ventilator parameters are collected. This data holds enormous potential for the prediction of postoperative outcomes. There is, however, minimal work exploring ...

2
Class imbalance correction in artificial intelligence models leads to miscalibrated clinical predictions: a real-world evaluation
2026-03-05 health informatics 10.64898/2026.03.04.26347634
#1 (14.2%)

Background: Predictive models employing machine learning algorithms are increasingly being used in clinical decision making, and improperly calibrated models can result in systematic harm. We sought to investigate the impact of class imbalance correction, a commonly applied preprocessing step in machine learning model development, on calibration and modelled clinical decision making in a large real-world context. Methods: A histogram boosted gradient classifier was trained on a highly imbalanced na...

3
Towards personalised early prediction of Intra-Operative Hypotension following anesthesia using Deep Learning and phenotypic heterogeneity
2023-01-20 health informatics 10.1101/2023.01.20.23284432
#1 (14.1%)

Intra-Operative Hypotension (IOH) is a haemodynamic abnormality that is commonly observed in operating theatres following general anesthesia and is associated with life-threatening post-operative complications. Using Long Short-Term Memory (LSTM) models applied to Electronic Health Records (EHR) and time-series intra-operative data in 604 patients who underwent colorectal surgery, we predicted the instant risk of IOH events within the next five minutes. K-means clustering was used to group patients...

4
A Neo4j-Based Framework for Integrating Clinical Data with Medical Ontologies: Performance Optimization and Quality Measure Applications in Healthcare
2025-07-21 health informatics 10.1101/2025.07.20.25322556
#1 (12.1%)

Background: Electronic Health Records face a fundamental challenge: the semantic gap between relational data storage and clinical reasoning patterns. Traditional databases struggle with complex healthcare queries requiring multiple joins and temporal analysis, creating performance bottlenecks that limit real-time clinical applications. Methods: We developed a Neo4j-based framework integrating MIMIC-IV clinical data (1,504 patients, 4,967 admissions) with SNOMED CT medical ontology through ICD-10-CM...

5
A Machine Learning Study to Improve Surgical Case Duration Prediction
2020-06-12 health systems and quality improvement 10.1101/2020.06.10.20127910
#1 (12.0%)

Predictive accuracy of surgical case duration plays a critical role in reducing the cost of operating room (OR) utilization. The most common approaches used by hospitals rely on historic averages based on a specific surgeon or a specific procedure type obtained from the electronic medical record (EMR) scheduling systems. However, the low predictive accuracy of EMR leads to negative impacts on patients and hospitals, such as rescheduling of surgeries and cancellation. In this study, we aim to improve pre...

6
Distilling the Knowledge from Large-language Model for Health Event Prediction
2024-06-24 health informatics 10.1101/2024.06.23.24309365
#1 (11.8%)

Health event prediction is empowered by the rapid and wide application of electronic health records (EHR). In the Intensive Care Unit (ICU), precisely predicting health-related events in advance is essential for providing treatment and intervention to improve patient outcomes. EHR is a kind of multi-modal data containing clinical text, time series, structured data, etc. Most health event prediction works focus on a single modality, e.g., text or tabular EHR. How to effectively learn fro...

7
Enhancing Automated Medical Coding: Evaluating Embedding Models for ICD-10-CM Code Mapping
2024-07-03 health informatics 10.1101/2024.07.02.24309849
#1 (11.7%)

Purpose: The goal of this study is to enhance automated medical coding (AMC) by evaluating the effectiveness of modern embedding models in capturing semantic similarity and improving the retrieval process for ICD-10-CM code mapping. Achieving consistent and accurate medical coding practices is crucial for effective healthcare management. Methods: We compared the performance of embedding models, including text-embedding-3-large, text-embedding-004, voyage-large-2-instruct, and mistralembed, against ...

8
An Interpretable Risk Prediction Model for Healthcare with Pattern Attention
2020-07-29 health informatics 10.1101/2020.07.26.20162479
#1 (11.7%)

Background: The availability of massive amounts of data makes clinical predictive tasks possible. Deep learning methods have achieved promising performance on these tasks. However, most existing methods suffer from three limitations: (i) There are many missing values for real-valued events; many methods impute the missing values and then train their models on the imputed values, which may introduce imputation bias. The models' performance is highly dependent on the imputation accurac...

9
Extraction and validation of patient housing and food insecurity status in a large electronic health records database using selective prediction and active learning
2022-12-06 health informatics 10.1101/2022.12.06.22283140
#1 (11.7%)

Objective: Information on patient social determinants of health is frequently recorded in unstructured clinical notes, making it inaccessible for researchers and policymakers. We aimed to extract and validate food and housing insecurity status on a large electronic health record-derived patient cohort by combining selective prediction and active learning. Materials and Methods: Manually labeled charts selected via active learning were used to train L1-regularized logistic regression models to ident...

10
Bagged Fuzzy-Rough Nearest Neighbors (BFRNN): A Novel Ensemble Learning Algorithm for Disease Diagnosis and Prognosis Prediction
2023-10-22 health informatics 10.1101/2023.10.21.23297353
#1 (11.6%)

The purpose of the study is to develop a novel machine learning (ML) algorithm that can accurately predict malignant versus benign tumors. A novel ML hybrid ensemble model called "Bagged Fuzzy-Rough k-Nearest Neighbors" (BFRNN) was developed. BFRNN is an improvement over the widely used k-Nearest Neighbors algorithm due to its use of fuzzy-rough logic and a unique ensemble voting algorithm. Initially, graphical libraries were used to visualize the Wisconsin Breast Cancer biomarker dataset (WBCBD) t...

11
KIT-LSTM: Knowledge-guided Time-aware LSTM for Continuous Clinical Risk Prediction
2022-11-18 health informatics 10.1101/2022.11.14.22282332
#1 (11.6%)

Rapid accumulation of temporal Electronic Health Record (EHR) data and recent advances in deep learning have shown high potential for precise and timely prediction of patients' risks using AI. However, most existing risk prediction approaches ignore the complex asynchronous and irregular problems in real-world EHR data. This paper proposes a novel approach called Knowledge-guIded Time-aware LSTM (KIT-LSTM) for continuous mortality predictions using EHR. KIT-LSTM extends LSTM with two time-aware gat...

12
Minimal algorithms for knowledge representation in clinical decision support systems research: a theoretical and empirical analysis
2022-04-21 health informatics 10.1101/2022.04.20.22274099
#1 (11.4%)

Clinical decision support systems (CDSS) stand out as one of the most promising technologies for data-centered and AI-prompted healthcare. Their current development is mainly guided by two disparate mindsets, namely a machine learning-centered framework and a classical rule-based framework. These approaches present contrasting pros and cons. In the present study we provide an analysis showing that these two mindsets are actually related to each other, and straightforward algorithms...

13
Feasibility of converting Japanese oncology electronic medical records into the Observational Medical Outcomes Partnership Common Data Model and data quality assessment
2025-06-16 health informatics 10.1101/2025.06.13.25329609
#1 (11.3%)

The potential of utilizing Japanese electronic medical record (EMR) data in global observational research is significant because of high EMR adoption and universal health insurance. However, few studies have addressed the conversion of Japanese EMR data to the Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) standard, which regulates EMR data for global observational research. In this study, we investigated the feasibility of converting Japanese oncology EMR data to the ...

14
Process Mining/ Deep Learning Model to Predict Mortality in Coronary Artery Disease Patients
2024-06-27 health informatics 10.1101/2024.06.26.24309553
#1 (11.3%)

Patients with Coronary Artery Disease (CAD) are at high risk of death. CAD is the third leading cause of mortality worldwide. However, there is a lack of research concerning CAD patient mortality prediction; thus, more accurate prediction modeling is needed to predict the mortality of patients diagnosed with CAD. This paper demonstrates performance improvements in predicting the mortality of CAD patients. The proposed framework is a modification of the work used for the prediction of 30-day read...

15
Graph-based Fusion Modeling and Explanation for Disease Trajectory Prediction
2022-10-28 health informatics 10.1101/2022.10.25.22281469
#1 (11.3%)

We propose a relational graph to incorporate clinical similarity between patients while building personalized clinical event predictors with a focus on hospitalized COVID-19 patients. Our graph formation process fuses heterogeneous data, i.e., chest X-rays as node features and non-imaging EHR for edge formation. While a node represents a snapshot in time for a single patient, the weighted edge structure encodes complex clinical patterns among patients. While age and gender have been used in the past ...

16
Linking the NETSARC+ national sarcoma database with the SNDS to evaluate adjuvant and/or neoadjuvant therapy: report on the linkage process and result (Health Data Hub's DEEPSARC pilot project)
2025-05-06 health informatics 10.1101/2025.05.02.25326859
#1 (11.3%)

Background: DEEPSARC, one of the first projects running on the Health Data Hub, aimed to identify real-life treatment regimens that could improve overall survival. The project is based on matching the national database of the sarcoma reference network with the SNDS. Objectives: We aimed to report a transparent description of the linking process and its results. Methods: The sarcoma database encompasses 33,548 patients matching the selection criteria divided into three subsets: 13,507 patients with a com...

17
DICE: Deep Significance Clustering for Outcome-Driven Stratification
2020-10-04 health informatics 10.1101/2020.10.04.20204321
#1 (11.3%)

We present deep significance clustering (DICE), a framework for jointly performing representation learning and clustering for "outcome-driven" stratification. Motivated by practical needs in medicine to risk-stratify patients into subgroups, DICE brings self-supervision to unsupervised tasks to generate cluster membership that may be used to categorize unseen patients by risk levels. DICE is driven by a combined objective function and constraint which require a statistically significant associat...

18
Building Prediction Models for 30-Day Readmissions Among ICU Patients Using Both Structured and Unstructured Data in Electronic Health Records
2021-08-11 health informatics 10.1101/2021.08.10.21261858
#1 (11.1%)

ICU readmissions are associated with poor outcomes for patients and poor performance of hospitals. Patients who are readmitted have an increased risk of in-hospital death; hospitals with a higher readmission rate have reduced profitability, due to an increase in cost and reduced payments from Medicare and Medicaid programs. Predicting a patient's likelihood of being readmitted to the ICU can help reduce early discharges and the risk of in-hospital deaths, and help increase profitability. In this ...

19
Unlocking the Power of EHRs: Harnessing Unstructured Data for Machine Learning-based Outcome Predictions
2023-02-23 health informatics 10.1101/2023.02.13.23285873
#1 (11.1%)

The integration of Electronic Health Records (EHRs) with Machine Learning (ML) models has become imperative in examining patient outcomes due to the vast amounts of clinical data they provide. However, critical information regarding social and behavioral factors that affect health, such as social isolation, stress, and mental health complexities, is often recorded in unstructured clinical notes, hindering its accessibility. This has resulted in an over-reliance on clinical data in current EHR-ba...

20
Development & Deployment of a Real-time Healthcare Predictive Analytics Platform
2023-04-11 health informatics 10.1101/2023.04.10.23288373
#1 (11.1%)

The deployment of predictive analytic algorithms that can safely and seamlessly integrate into existing healthcare workflows remains a significant challenge. Here, we present a scalable, cloud-based, fault-tolerant platform that is capable of extracting and processing electronic health record (EHR) data for any patient at any time following admission and transferring results back into the EHR. This platform has been successfully deployed within the UC San Diego Health system and utilizes interop...